Supp Materials: An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
Similar resources
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
In this paper, we explore theoretical properties of training a two-layered ReLU network $g(x; w) = \sum_{j=1}^{K} \sigma(w_j^T x)$ with centered d-dimensional spherical Gaussian input x (σ = ReLU). We train our network with gradient descent on w to mimic the output of a teacher network with the same architecture and fixed parameters w∗. We show that its population gradient has an analytical formula, leading to i...
Symmetry-breaking Convergence Analysis of Certain Two-layered Neural Networks with Relu Nonlinearity
In this paper, we use dynamical system to analyze the nonlinear weight dynamics of two-layered bias-free networks in the form of $g(x; w) = \sum_{j=1}^{K} \sigma(w_j^T x)$, where σ(·) is ReLU nonlinearity. We assume that the input x follow Gaussian distribution. The network is trained using gradient descent to mimic the output of a teacher network of the same size with fixed parameters w∗ using l2 loss. We fir...
A conjugate gradient based method for Decision Neural Network training
Decision Neural Network is a new approach for solving multi-objective decision-making problems based on artificial neural networks. Using inaccurate evaluation data, network training has improved and the number of educational data sets has decreased. The available training method is based on the gradient descent method (BP). One of its limitations is related to its convergence speed. Therefore,...
On the Flatness of Loss Surface for Two-layered ReLU Networks
Deep learning has achieved unprecedented practical success in many applications. Despite its empirical success, however, the theoretical understanding of deep neural networks still remains a major open problem. In this paper, we explore properties of two-layered ReLU networks. For simplicity, we assume that the optimal model parameters (also called groundtruth parameters) are known. We then ass...
The new implicit finite difference scheme for two-sided space-time fractional partial differential equation
Fractional order partial differential equations are generalizations of classical partial differential equations. Increasingly, these models are used in applications such as fluid flow, finance and others. In this paper we examine some practical numerical methods to solve a class of initial-boundary value fractional partial differential equations with variable coefficients on a finite domain. S...
Publication date: 2017